
    The 3+1 decomposition of Conformal Yano-Killing tensors and "momentary" charges for spin-2 field

    The "fully charged" spin-2 field solution is presented. This is an analog of the Coulomb solution in electrodynamics and represents the "non-waving" part of the spin-2 field theory. Basic facts and definitions of the spin--2 field and conformal Yano-Killing tensors are introduced. Application of those two objects provides a precise definition of quasi-local gravitational charge. Next, the 3+1 decomposition leads to the construction of the momentary gravitational charges on initial surface which is applicable for Schwarzschild-like spacetimes.Comment: 17 page

    Tiered Pruning for Efficient Differentiable Inference-Aware Neural Architecture Search

    We propose three novel pruning techniques to improve the cost and results of inference-aware Differentiable Neural Architecture Search (DNAS). First, we introduce a stochastic bi-path building block for DNAS that can search over inner hidden dimensions while remaining efficient in memory and compute. Second, we present an algorithm for pruning blocks within a stochastic layer of the SuperNet during the search. Third, we describe a novel technique for pruning unnecessary stochastic layers during the search. The optimized models resulting from the search, called PruNet, establish a new state-of-the-art Pareto frontier for NVIDIA V100 in terms of inference latency versus ImageNet Top-1 image classification accuracy. As a backbone, PruNet also outperforms GPUNet and EfficientNet on the COCO object detection task in terms of inference latency relative to mean Average Precision (mAP).
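    As a rough illustration of the second technique (a conceptual sketch, not the authors' implementation; the class name, softmax mixing, and threshold are assumptions), a stochastic SuperNet layer can hold several candidate blocks mixed by learnable architecture logits and drop candidates whose probability falls below a cutoff as the search proceeds:

import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticLayer(nn.Module):
    """One SuperNet layer holding several candidate blocks.

    Candidates are mixed by softmax-weighted architecture logits; weak
    candidates are pruned during the search (illustrative sketch only).
    """

    def __init__(self, candidates):
        super().__init__()
        self.candidates = nn.ModuleList(candidates)
        self.arch_logits = nn.Parameter(torch.zeros(len(candidates)))

    def forward(self, x):
        probs = F.softmax(self.arch_logits, dim=0)
        # Weighted sum over the surviving candidate blocks.
        return sum(p * block(x) for p, block in zip(probs, self.candidates))

    def prune_blocks(self, threshold=0.05):
        """Drop candidates whose architecture probability is below threshold."""
        probs = F.softmax(self.arch_logits, dim=0)
        keep = [i for i, p in enumerate(probs) if p.item() >= threshold]
        if len(keep) < len(self.candidates):
            self.candidates = nn.ModuleList(self.candidates[i] for i in keep)
            self.arch_logits = nn.Parameter(self.arch_logits.data[keep])
            # In a real search the optimizer state for arch_logits would
            # need to be rebuilt after this reassignment.
        return len(keep)

    In the same spirit, a whole stochastic layer can be removed once its strongest surviving candidate is an identity/skip block, which corresponds to the third technique of pruning unnecessary layers.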

    Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training

    Deploying deep learning (DL) models across multiple compute devices to train large and complex models continues to grow in importance because of the demand for faster and more frequent training. Data parallelism (DP) is the most widely used parallelization strategy, but as the number of devices in data-parallel training grows, so does the communication overhead between devices. Additionally, a larger aggregate batch size per step leads to a loss of statistical efficiency, i.e., a larger number of epochs is required to converge to a desired accuracy. These factors affect overall training time, and beyond a certain number of devices the speedup from leveraging DP begins to scale poorly. In addition to DP, each training step can be accelerated by exploiting model parallelism (MP). This work explores hybrid parallelization, where each data-parallel worker comprises more than one device, across which the model dataflow graph (DFG) is split using MP. We show that at scale, hybrid training will be more effective at minimizing end-to-end training time than exploiting DP alone. We project that for Inception-V3, GNMT, and BigLSTM, the hybrid strategy provides an end-to-end training speedup of at least 26.5%, 8%, and 22% respectively, compared to what DP alone can achieve at scale.
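    To make the trade-off concrete, here is a deliberately simple analytical toy model (all constants and function names are illustrative assumptions, not measurements or formulas from the paper): pure DP shrinks per-worker compute but pays an allreduce cost that grows with the number of workers, while a hybrid scheme keeps fewer, faster DP workers by splitting each one across several devices with MP.

def dp_step_time(devices, compute=1.0, comm_per_worker=0.02):
    # Per-step time under pure data parallelism: compute shrinks with the
    # number of workers, but the allreduce cost grows with it.
    return compute / devices + comm_per_worker * devices


def hybrid_step_time(devices, mp_degree=2, compute=1.0,
                     comm_per_worker=0.02, mp_speedup=1.6):
    # Hybrid DP+MP: each DP worker spans `mp_degree` devices, so fewer
    # workers take part in the allreduce, and MP speeds up each worker's
    # compute by `mp_speedup` (an assumed, model-dependent factor).
    workers = devices // mp_degree
    return compute / workers / mp_speedup + comm_per_worker * workers


if __name__ == "__main__":
    for n in (8, 32, 128):
        print(f"{n:4d} devices  DP-only: {dp_step_time(n):.4f}  "
              f"hybrid: {hybrid_step_time(n):.4f}")

    Running this for 8, 32, and 128 devices shows the DP-only step time eventually growing as communication dominates, while the hybrid curve degrades more slowly because fewer workers participate in the allreduce; this is the qualitative effect the paper quantifies for Inception-V3, GNMT, and BigLSTM.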

    SPINN: Synergistic Progressive Inference of Neural Networks over Device and Cloud

    Despite the soaring use of convolutional neural networks (CNNs) in mobile applications, uniformly sustaining high-performance inference on mobile has been elusive due to the excessive computational demands of modern CNNs and the increasing diversity of deployed devices. A popular alternative comprises offloading CNN processing to powerful cloud-based servers. Nevertheless, by relying on the cloud to produce outputs, emerging mission-critical and high-mobility applications, such as drone obstacle avoidance or interactive applications, can suffer from the dynamic connectivity conditions and the uncertain availability of the cloud. In this paper, we propose SPINN, a distributed inference system that employs synergistic device-cloud computation together with a progressive inference method to deliver fast and robust CNN inference across diverse settings. The proposed system introduces a novel scheduler that co-optimises the early-exit policy and the CNN splitting at run time, in order to adapt to dynamic conditions and meet user-defined service-level requirements. Quantitative evaluation illustrates that SPINN outperforms its state-of-the-art collaborative inference counterparts by up to 2x in achieved throughput under varying network conditions, reduces the server cost by up to 6.8x and improves accuracy by 20.7% under latency constraints, while providing robust operation under uncertain connectivity conditions and significant energy savings compared to cloud-centric execution.
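    The core run-time decision can be pictured with a small sketch (a conceptual illustration under assumed names such as device_head, early_exit, and cloud_tail; it is not SPINN's actual API): run the on-device part of the CNN up to the split point, return the early-exit prediction if it is confident enough or the cloud is unreachable, and otherwise offload the intermediate features to the cloud-side remainder.

import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def progressive_inference(x, device_head, early_exit, cloud_tail,
                          confidence_threshold=0.8, cloud_available=True):
    """Run the on-device portion, exit early if confident, else offload.

    Conceptual sketch only; in SPINN the split point and exit policy are
    chosen by a scheduler that co-optimises both at run time.
    """
    features = device_head(x)              # on-device layers up to the split
    probs = softmax(early_exit(features))  # early-exit classifier on device
    if probs.max() >= confidence_threshold or not cloud_available:
        return int(probs.argmax()), "device"
    logits = cloud_tail(features)          # remainder of the CNN in the cloud
    return int(np.asarray(logits).argmax()), "cloud"

    In SPINN the equivalent knobs (where to split the network and how aggressive the early-exit policy is) are not fixed constants but are co-optimised by the scheduler against the current connectivity and the user-defined service-level requirements.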